Thai type style recognition

Authors

  • Chularat Tanprasert
  • Sutat Sae-Tang
Abstract

Thai typed character recognition has been a very popular research topic in Thailand. Three commercial Thai OCR software packages are currently available to the public, but none of them preserves the type styles of the original document image (normal, bold, italic, and bold italic) in the output text file. This paper presents a technique for preserving these Thai type styles by applying a specific preprocessing step together with a supervised neural network (NN) learning algorithm. Experiments have been conducted, and the results confirm that the proposed technique effectively preserves the type styles of Thai typed fonts from the original document image in the output text file.
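The abstract does not specify the preprocessing or the network architecture, so the following is only an illustrative sketch of the general idea: extract style-related features from a glyph image, then train a supervised classifier over the four styles. Everything in it is an assumption made for illustration, including the synthetic stroke glyphs, the two features (ink density and slant), the min-max scaling, and the single-layer multiclass perceptron standing in for the paper's neural network.

```python
STYLES = ["normal", "bold", "italic", "bold italic"]

def make_glyph(bold, italic, offset=3, size=16):
    """Hypothetical stand-in for a scanned character: one vertical stroke."""
    width = 4 if bold else 2                      # bold = thicker stroke
    rows = []
    for r in range(size):
        # Italic strokes lean right toward the top of the glyph.
        shift = (size - 1 - r) // 4 if italic else 0
        start = offset + shift
        rows.append([1 if start <= c < start + width else 0 for c in range(size)])
    return rows

def features(glyph):
    """Assumed preprocessing: ink density and top-vs-bottom slant."""
    size = len(glyph)
    ink = sum(map(sum, glyph))

    def mean_col(rows):
        cols = [c for row in rows for c, v in enumerate(row) if v]
        return sum(cols) / len(cols)

    slant = mean_col(glyph[: size // 2]) - mean_col(glyph[size // 2:])
    return [ink / (size * size), slant]

def fit_scaler(rows):
    """Per-feature minimum and maximum over the training set."""
    return [min(c) for c in zip(*rows)], [max(c) for c in zip(*rows)]

def scale(x, lo, hi):
    """Min-max scale each feature and append a constant bias input."""
    return [(v - l) / (h - l) for v, l, h in zip(x, lo, hi)] + [1.0]

def predict(weights, x):
    """Index of the output unit with the highest score."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return scores.index(max(scores))

def train(samples, labels, epochs=1000):
    """Multiclass perceptron: one linear output unit per style."""
    weights = [[0.0] * len(samples[0]) for _ in STYLES]
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            pred = predict(weights, x)
            if pred != y:                         # update only on errors
                mistakes += 1
                for i, v in enumerate(x):
                    weights[y][i] += v
                    weights[pred][i] -= v
        if mistakes == 0:                         # training set fully separated
            break
    return weights

# Build a small labelled training set from the synthetic glyphs.
glyphs, labels = [], []
combos = [(False, False), (True, False), (False, True), (True, True)]
for label, (bold, italic) in enumerate(combos):
    for offset in (2, 3, 4):
        glyphs.append(make_glyph(bold, italic, offset))
        labels.append(label)

raw = [features(g) for g in glyphs]
lo, hi = fit_scaler(raw)
weights = train([scale(x, lo, hi) for x in raw], labels)

def classify(glyph):
    """Return the predicted style name for one glyph image."""
    return STYLES[predict(weights, scale(features(glyph), lo, hi))]

print(classify(make_glyph(True, True, offset=5)))
```

In a real OCR pipeline the predicted style would be attached to each recognized character so that the output text file can reproduce it. Real scanned Thai glyphs would need far richer features and a larger network than this toy setup.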


Related resources

Topic and style-adapted language modeling for Thai broadcast news ASR

The amount of transcribed Thai broadcast news text available for training a language model is still very limited compared to other major languages. Since the construction of a broadcast news corpus is very costly and time-consuming, newspaper text is often used to increase the size of the training text data. This paper proposes a language model topic and style adaptation approach for a Thai broad...


Speed Compensation for Improving Thai Spelling Recognition with a Continuous Speech Corpus

Spelling recognition is an approach to enhance a speech recognizer to cope with incorrectly recognized words and out-of-vocabulary words. This paper presents a general framework for Thai speech recognition, enhanced with spelling recognition. To implement Thai spelling recognition, Thai alphabets and their spelling methods are analyzed. Based on hidden Markov models, we propose a method to cons...


Structural Modeling of Fundamental Frequency Contour for Thai Expressive Speech

Problem statement: Appropriate modeling of the fundamental frequency (F0) contour of speech is a key factor in preserving the quality of speech prosody. One successful approach has been developed for the tonal language Mandarin Chinese. It is based on the assumption that the behavioral characteristics of vocal-fold elongation in vibration could be approximated by those of a simple forced vibrating sy...


Font Descriptor Construction for Printed Thai Character Recognition

The evolution of fonts into many types has a great impact on the recognition performance of optical character recognition (OCR) systems. Greater font diversity leads to lower recognition accuracy, particularly for Thai fonts. To overcome this obstacle, this paper proposes a font descriptor for printed Thai character recognition. The role of such a descriptor is a representative...


Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since prosody affects not only the naturalness but also the intelligibility of speech. Focusing on the synthesis of Thai expressive speech, a number of systems have been developed over the years. However, the expressive speech with various speaking styles has not been accomplishe...




Publication date: 1999